Disk-Tape Joins: Synchronizing Disk and Tape
Authors
Abstract
Today large amounts of data are stored on tertiary storage media such as magnetic tapes and optical disks. DBMSs typically operate only on magnetic disks since they know how to maneuver disks and how to optimize accesses on them. Tertiary devices present a problem for DBMSs since these devices have dismountable media and have very different operational characteristics compared to magnetic disks. For instance, most tape drives offer very high capacity at low cost but are accessed sequentially, involve lengthy latencies, and deliver lower bandwidth. Typically, the scope of a DBMS's query optimizer does not include tertiary devices, and the DBMS might not even know how to control and operate upon tertiary-resident data. In a three-level hierarchy of storage devices (main memory, disk, tape), the typical solution is to elevate tape-resident data to disk devices, thus bringing such data into the DBMS's control, and then to perform the required operations on disk. This requires additional space on disk and may not give the lowest response time possible. With this challenge in mind, we studied the trade-offs between memory and disk requirements and the execution time of a join with the help of two well-known join methods. The conventional, disk-based Nested Block Join and Hybrid Hash Join were modified to operate directly on tapes. An experimental implementation of the modified algorithms gave us more insight into how the algorithms perform in practice. Our performance analysis shows that a DBMS desiring to operate on tertiary storage will benefit from special algorithms that operate directly on tape-resident data and take into account and exploit the mismatch in disk and tape characteristics.
1 Introduction
Over the last few years, a need to record and process vast quantities of data has emerged in business environments as well as in scientific settings.
Since tertiary devices offer lower cost per megabyte and smaller footprint than magnetic disks, large volumes of data are routinely stored on
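The abstract's idea of running a Nested Block Join directly against tape, rather than first copying the tape-resident relation to disk, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: it assumes, purely for illustration, that the tape relation is the outer input and is streamed exactly once in sequential order, while the disk relation is cheap to rescan. The names `tape_nested_block_join`, `tape_scan`, `disk_scan`, and `block_size` are hypothetical.

```python
from collections import defaultdict
from itertools import islice
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Row = Tuple[int, str]  # (join key, payload); an assumed row layout


def blocks(rows: Iterable[Row], block_size: int) -> Iterator[List[Row]]:
    """Cut a sequential stream into memory-sized blocks without rewinding."""
    it = iter(rows)
    while True:
        block = list(islice(it, block_size))
        if not block:
            return
        yield block


def tape_nested_block_join(
    tape_scan: Iterable[Row],
    disk_scan: Callable[[], Iterable[Row]],
    block_size: int,
) -> Iterator[Tuple[int, str, str]]:
    """Join a tape-resident relation with a disk-resident one.

    Assumption: the tape is read once, front to back (no seeks), and the
    disk relation is rescanned once per in-memory block of tape rows.
    """
    for block in blocks(tape_scan, block_size):
        # Index the current tape block in memory by join key.
        index: Dict[int, List[str]] = defaultdict(list)
        for key, payload in block:
            index[key].append(payload)
        # Rescan the disk relation and probe the in-memory block.
        for key, disk_payload in disk_scan():
            for tape_payload in index.get(key, ()):
                yield (key, tape_payload, disk_payload)


tape_rows = [(1, "a"), (2, "b"), (3, "c"), (2, "d")]
disk_rows = [(2, "x"), (3, "y"), (4, "z")]
result = sorted(
    tape_nested_block_join(iter(tape_rows), lambda: iter(disk_rows), block_size=2)
)
```

In this sketch a larger `block_size` means fewer rescans of the disk relation, which is one simple instance of the memory versus execution-time trade-off the abstract mentions.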
Similar resources
Virtual Tape Libraries: The Best of Tape and Disk Backup
Tape backup has traditionally been the mainstay of enterprise data protection when long-term data protection is required. As disk technologies have improved and economies of scale have driven down prices, disk adoption rates have surged and it would appear disk is poised to eclipse tape as the dominant backup platform and relegate tape to a minor archival and disaster recovery role. Numerous su...
Tape-Disk Join Strategies under Disk Contention
Large-scale data warehousing, data mining, and scientific applications require the analysis of terabytes of facts data accumulated over long periods of time. Tape libraries are suitable devices for storing such mass data. The online analytical processing (OLAP) of this data typically leads to long-running aggregation queries joining the tape-resident facts relation with disk-resident dimension ...
Database backups using virtual tape volumes
The time it takes for the database administrator to conduct the database recovery process directly affects the company’s bottom line. Database downtime impact can be measured in millions of dollars resulting in lost opportunity and disgruntled customers. This paper describes virtual tape technology and advantages which can be realized by the database administrator when database recovery efficie...
Influence of Technology on Magnetic Tape Storage Device Characteristics
There are available today many data storage devices that serve the diverse application requirements of the consumer, professional entertainment, and computer data processing industries. Storage technologies include semiconductors, several varieties of optical disk, optical tape, magnetic disk, and many varieties of magnetic tape. In some cases, devices are developed with specific characteristic...
A Cost-effective Near-line Storage Server for Multimedia System
In this paper, we consider a storage server architecture for multimedia information systems. While most other works on multimedia storage servers assume on-line disk storage [12, 11, 15, 8, 2], we consider a two-tier storage architecture with a robotic tape library as the vast near-line storage and on-line disks as the front-line storage. Magnetic tapes are cheaper, more robust, and have a larg...